README for "The Impact of University Attendance on Partisanship"
Apfeld, Coman, Gerring and Jessee
Political Science Research and Methods


The results presented in the paper are based on multiple datasets:
- the scores on the Romanian baccalaureate examination between 2004 and 2019 (bac_scores.Rdata)
- a survey of Romanians conducted by the authors (final_data.dta)
- a short, anonymised version of the bac results dataset (replication_dataset.dta)
- the Barro-Lee Educational Attainment Dataset (BL2013_MF_v2.2.csv)
- the World Values Survey Cross-National Wave 7 (WVS_Cross-National_Wave_7_Stata_v1_5.dta)

These datasets are included here. Some have been modified to ensure anonymity; none of these modifications affects the results of the paper. To reproduce the paper's results, download all datasets and all code files to the same directory before running the code. Some packages (listed at the top of the respective code files) must also be installed before running the code.
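
As a quick pre-flight check, the following R sketch reports which files are missing from a directory and which packages are not installed. The helper names missing_files and missing_packages are hypothetical (they are not part of the replication code), and the file list simply repeats names mentioned in this README, so adjust it to match your copy of the archive.

```r
# File names taken from this README; edit the list if your download differs.
required_files <- c(
  "final_data.dta",
  "bac_scores.Rdata",
  "WVS_Cross-National_Wave_7_Stata_v1_5.dta",
  "replication-code.R",
  "figures_B1_B2.R",
  "rdplotdensity_two_lines.R",
  "figure_d1.R"
)

# Hypothetical helper: returns the entries of `files` not found in `dir`.
missing_files <- function(files, dir = getwd()) {
  files[!file.exists(file.path(dir, files))]
}

# Hypothetical helper: returns the entries of `pkgs` that are not installed.
# The package names themselves are listed at the top of each code file.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Lists anything still missing from the current working directory.
print(missing_files(required_files))
```

An empty result from both helpers suggests the directory is ready; anything listed should be downloaded or installed first.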

Below is a description of the code files and what results they produce:

"replication-code.R"
R code that reproduces all results based on the survey of Romanians used for the regression discontinuity analyses in the paper. This includes all tables and figures in the main paper, and all tables and figures in the appendix except those listed below as produced by other code files. Requires data file "final_data.dta".

"figures_B1_B2.R"
R code that reproduces all results based on the Bac test results. This includes Figure B1 and Figure B2 in the appendix. Note that "rdplotdensity_two_lines.R" provides a modified plotting function required to produce Figure B2. This file should be downloaded to the same directory as the other code files (formally, it must be in R's working directory when "figures_B1_B2.R" is run in order to produce the figure). Requires data file "bac_scores.Rdata".
Note that because Appendix F is a reprint of our study's pre-registration, the figures F1 and F2 are identical to figures B1 and B2, respectively, which appear earlier in the appendix.
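
Because the helper file must be in R's working directory, one way to run a script is to switch to the replication directory temporarily. The sketch below is only an illustration: run_in_dir is a hypothetical helper, not part of the replication code, and the directory path in the usage comment is a placeholder.

```r
# Hypothetical helper: runs `script` with `dir` as the working directory,
# then restores the previous working directory, even if the script fails.
run_in_dir <- function(script, dir) {
  old <- setwd(dir)              # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)
  source(script)
}

# Example usage (placeholder path; replace with your own directory):
# run_in_dir("figures_B1_B2.R", "path/to/replication/files")
```

Setting the working directory this way means relative file references inside the script, such as the helper file and the data file, resolve against the replication directory.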

"figure_d1.R"
R code that reproduces Figure D1 in the appendix. This script automatically downloads the Barro-Lee dataset used to plot the percentage of the population of each country with partial or complete tertiary education.

"table_b1.do"
Stata do file that produces some of the results in Table B1, namely the columns labeled "Pop" and "Pop<24". Requires data file "WVS_Cross-National_Wave_7_Stata_v1_5.dta". (NOTE: the RNE sample percentages, listed under the columns labeled "Sample", are replicated by "replication-code.R".)

"table_b2.do"
Stata do file that produces the results of Table B2. Requires data file "table_b2.dta".

"table_g1.do"
Stata do file that produces the results of Table G1. Requires data file "table_g1.dta".
